Here, we’re just setting a few options.
knitr::opts_chunk$set(
  warning = TRUE, # show warnings during codebook generation
  message = TRUE, # show messages during codebook generation
  error = TRUE,   # do not interrupt codebook generation in case of errors;
                  # usually better for debugging
  echo = TRUE     # show R code
)
ggplot2::theme_set(ggplot2::theme_bw())
Now, we’re preparing our data for the codebook.
library(codebook)
codebook_data <- codebook::bfi
# to import an SPSS file from the same folder uncomment and edit the line below
# codebook_data <- rio::import("mydata.sav")
# for Stata
# codebook_data <- rio::import("mydata.dta")
# for CSV
# codebook_data <- rio::import("mydata.csv")
# omit the following lines if your missing values are already properly labelled
codebook_data <- detect_missing(codebook_data,
  only_labelled = TRUE, # only labelled values are autodetected as
                        # missing
  negative_values_are_missing = FALSE, # if TRUE, negative values are treated as missing
  ninety_nine_problems = TRUE # 99/999 are missing values, if they
                              # are more than 5 MAD from the median
)
# If you are not using formr, the codebook package needs to guess which items
# form a scale. detect_scales() finds item aggregates with names like this:
# scale = scale_1 + scale_2R + scale_3R
# Identifying these aggregates allows the codebook function to
# automatically compute reliabilities.
# However, it will not reverse items automatically.
# Here, the example data loaded above is replaced with the participant data.
codebook_data <- rio::import("C:/Users/Omgjk/OneDrive - Emory University/Work/Lopman/Codebooks/participant_factor.rds")
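Two of the comments in the chunk above describe rules that can be checked by hand. The following base R sketch illustrates both: the "more than 5 MAD from the median" rule for 99/999 codes, and reverse-scoring items whose names end in R before aggregating them. The data, the 1-5 response range, the helper `reverse()`, and the use of a row mean as the aggregate are all assumptions made for illustration; this is not the codebook package's own implementation.

```r
# Hypothetical responses on a 1-5 scale, with 99 used as a missing-value code.
x <- c(1, 2, 3, 3, 4, 5, 2, 3, 4, 99)

# Treat 99/999 as missing only when they lie more than 5 MADs from the median.
# stats::mad() rescales by 1.4826 by default; whether codebook uses the same
# scaling is an assumption here.
is_code <- x %in% c(99, 999)
is_far  <- abs(x - median(x)) > 5 * mad(x)
x_clean <- ifelse(is_code & is_far, NA_real_, x)

# Hypothetical items following the naming pattern scale_1, scale_2R, scale_3R.
items <- data.frame(
  scale_1  = c(1, 4, 5),
  scale_2R = c(5, 2, 1), # reverse-keyed item
  scale_3R = c(4, 1, 2)  # reverse-keyed item
)

# Reverse-score an item on an assumed low..high response range.
reverse <- function(x, low = 1, high = 5) (low + high) - x

# Reverse every column whose name ends in "R", then aggregate by row mean.
is_reversed <- grepl("R$", names(items))
items[is_reversed] <- lapply(items[is_reversed], reverse)
items$scale <- rowMeans(items[c("scale_1", "scale_2R", "scale_3R")])
```

Reversing the items before building the aggregate matters because, as the comment above notes, the reversal is not done automatically.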
Create codebook
codebook(codebook_data)
Dataset name: codebook_data
The dataset has N=304 rows and 9 columns. 303 rows have no missing values on any column.
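The counts in this summary can be reproduced with base R. The sketch below uses a small hypothetical data frame `df` as a stand-in for `codebook_data`, so the numbers it prints differ from the ones reported above.

```r
# Hypothetical stand-in for codebook_data: 4 rows, 3 columns, one missing value.
df <- data.frame(
  part_id = 1:4,
  age     = c(21, 36, 78, 40),
  edu     = factor(c("Master's degree", NA, "Bachelor's degree", "Master's degree"))
)

n_rows     <- nrow(df)                # number of rows (N)
n_cols     <- ncol(df)                # number of columns
n_complete <- sum(complete.cases(df)) # rows with no missing value in any column

cat(sprintf(
  "The dataset has N=%d rows and %d columns. %d rows have no missing values on any column.\n",
  n_rows, n_cols, n_complete
))
```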
Variables
Distribution of values for part_id
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| part_id | numeric | 0 | 1 | 1 | 152 | 304 | 152.5 | 87.90146 | ▇▇▇▇▇ | NA |
Distribution of values for age
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| age | numeric | 0 | 1 | 21 | 36 | 78 | 39.47368 | 12.96196 | ▇▅▃▂▁ | NA |
Distribution of values for age_cat
0 missing values.
| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|---|---|---|---|---|---|---|---|---|
| age_cat | factor | FALSE | 1. 20-29, 2. 30-39, 3. 40-49, 4. 50-59, 5. 60+ | 0 | 1 | 5 | 20-: 90, 30-: 76, 40-: 60, 50-: 49 | NA |
Distribution of values for gender
0 missing values.
| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|---|---|---|---|---|---|---|---|---|
| gender | factor | FALSE | 1. Female, 2. Male, 3. Prefer not to answer | 0 | 1 | 3 | Fem: 184, Mal: 116, Pre: 4 | NA |
Distribution of values for race
0 missing values.
| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|---|---|---|---|---|---|---|---|---|
| race | factor | FALSE | 1. Black, 2. White, 3. Asian, 4. Mixed, 5. Other | 0 | 1 | 5 | Whi: 174, Mix: 52, Asi: 48, Bla: 26 | NA |
Distribution of values for hispanic
0 missing values.
| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|---|---|---|---|---|---|---|---|---|
| hispanic | factor | FALSE | 1. None of these, 2. Yes | 0 | 1 | 2 | Non: 290, Yes: 14 | NA |
Distribution of values for edu
1 missing value.
| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|---|---|---|---|---|---|---|---|---|
| edu | factor | FALSE | 1. Associate degree in college (2-year), 2. Bachelor’s degree in college (4-year), 3. Doctoral degree or Professional degree (PhD, JD, MD), 4. High school graduate (high school diploma or equivalent including GED), 5. Master’s degree, 6. Some college but no degree | 1 | 0.9967105 | 6 | Mas: 146, Bac: 118, Doc: 22, Som: 7 | NA |
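The complete_rate reported for edu follows from the row count and the number of missing values: with N=304 rows and 1 missing value, 303 of 304 values are present. A quick check in R, using only the figures reported in this codebook:

```r
n_rows    <- 304 # rows in the dataset, as reported in the summary
n_missing <- 1   # missing values on edu

# Share of non-missing values.
complete_rate <- (n_rows - n_missing) / n_rows
print(signif(complete_rate, 7))
```

Rounded to seven significant digits this matches the 0.9967105 shown in the table.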
Distribution of values for hh_str
0 missing values.
| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|---|---|---|---|---|---|---|---|---|
| hh_str | factor | FALSE | 1. Live alone, 2. Live with parent, 3. Other, 4. Roommate or sibling, 5. Spouse and children only, 6. Spouse only | 0 | 1 | 6 | Spo: 97, Spo: 76, Liv: 44, Roo: 39 | NA |
Distribution of values for state_res2
0 missing values.
| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|---|---|---|---|---|---|---|---|---|
| state_res2 | factor | FALSE | 1. Georgia, 2. Illinois, 3. Other, 4. Virginia | 0 | 1 | 4 | Geo: 148, Oth: 98, Vir: 30, Ill: 28 | NA |
The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.
{
"name": "codebook_data",
"datePublished": "2020-08-24",
"description": "The dataset has N=304 rows and 9 columns.\n303 rows have no missing values on any column.\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name |label | n_missing|\n|:----------|:-----|---------:|\n|part_id |NA | 0|\n|age |NA | 0|\n|age_cat |NA | 0|\n|gender |NA | 0|\n|race |NA | 0|\n|hispanic |NA | 0|\n|edu |NA | 1|\n|hh_str |NA | 0|\n|state_res2 |NA | 0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.2).",
"keywords": ["part_id", "age", "age_cat", "gender", "race", "hispanic", "edu", "hh_str", "state_res2"],
"@context": "http://schema.org/",
"@type": "Dataset",
"variableMeasured": [
{
"name": "part_id",
"@type": "propertyValue"
},
{
"name": "age",
"@type": "propertyValue"
},
{
"name": "age_cat",
"value": "1. 20-29,\n2. 30-39,\n3. 40-49,\n4. 50-59,\n5. 60+",
"@type": "propertyValue"
},
{
"name": "gender",
"value": "1. Female,\n2. Male,\n3. Prefer not to answer",
"@type": "propertyValue"
},
{
"name": "race",
"value": "1. Black,\n2. White,\n3. Asian,\n4. Mixed,\n5. Other",
"@type": "propertyValue"
},
{
"name": "hispanic",
"value": "1. None of these,\n2. Yes",
"@type": "propertyValue"
},
{
"name": "edu",
"value": "1. Associate degree in college (2-year),\n2. Bachelor's degree in college (4-year),\n3. Doctoral degree or Professional degree (PhD, JD, MD),\n4. High school graduate (high school diploma or equivalent including GED),\n5. Master's degree,\n6. Some college but no degree",
"@type": "propertyValue"
},
{
"name": "hh_str",
"value": "1. Live alone,\n2. Live with parent,\n3. Other,\n4. Roommate or sibling,\n5. Spouse and children only,\n6. Spouse only",
"@type": "propertyValue"
},
{
"name": "state_res2",
"value": "1. Georgia,\n2. Illinois,\n3. Other,\n4. Virginia",
"@type": "propertyValue"
}
]
}